Abstract

The advent of multipoint (multicast-based) applications and the growth and complexity of the
Internet have complicated network protocol design and evaluation.

In this paper, we present a method for automatic synthesis of worst and best case scenarios for 
multipoint protocol performance evaluation. Our method uses a fault-oriented test generation (FOTG) 
algorithm for searching the protocol and system state space to synthesize these scenarios. The algorithm 
is based on a global finite state machine (FSM) model. We extend the algorithm with timing semantics 
to handle end-to-end delays and address performance criteria. We introduce the notion of a virtual 
LAN to represent delays of the underlying multicast distribution tree. 

As a case study, we use our method to evaluate variants of the timer suppression mechanism, used in 
various multipoint protocols, with respect to two performance criteria: overhead of response messages 
and response time. Simulation results for reliable multicast protocols show that our method provides a 
scalable way for synthesizing worst-case scenarios automatically. We expect our method to serve as a 
model for applying systematic scenario generation to other multipoint protocols. 

1. INTRODUCTION 

The longevity and power of Internet technologies derive from their ability to operate under a wide range of operating
conditions (underlying topologies and transmission characteristics, as well as heterogeneous applications generating
varied traffic inputs). Perhaps more than for any other technology, the range of operating conditions is enormous (it is
the cross product of the top and bottom of the IP protocol stack).

Perhaps it is this enormous set of conditions that has inhibited the development of systematic approaches to 
analyzing Internet protocol designs. How can we test correctness or characterize performance of a protocol when the 
set of inputs is intractable? Nevertheless, networking infrastructure is increasingly critical and there is an enormous need
to increase the robustness and understanding of network protocols. It is time to develop techniques for systematic 
testing of protocol behavior, even in the face of the above challenges and obstacles. At the same time we do not 
expect that complex adaptive protocols will be automatically verifiable under their full range of conditions. Rather, 
we are proposing a framework in which a protocol designer can follow a set of systematic steps, assisted by automation 
where possible, to cover a specific part of the design and operating space. 

In our proposed framework, a protocol designer will still need to create the initial mechanisms, describe them in
the form of a finite state machine, and identify the performance criteria or correctness conditions that need to be 
investigated. Our automated method will pick up at that point, providing algorithms that eventually result in 
scenarios or test suites that stress the protocol with respect to the identified criteria. 

This paper demonstrates our progress in realizing this vision as we present our method and apply it to the 
performance evaluation of multipoint protocols. 

1.1. MOTIVATION

The recent growth of the Internet and its increased heterogeneity have introduced new failure modes and added
complexity to protocol design and testing. In addition, the advent of multipoint applications has introduced new
challenges of a qualitatively different nature from those of traditional point-to-point protocols. Multipoint applications
involve a group of receivers and one or more senders. As more complex multipoint applications and protocols are 
coming to life, the need for systematic and automatic methods to study and evaluate such protocols is becoming more 
apparent. Such methods aim to expedite the protocol development cycle and improve resulting protocol robustness 
and performance. 

Through our proposed methodology for test synthesis, we hope to address the following key issues of protocol 
design and evaluation. 

■ Scenario dependent evaluation, and the use of validation test suites: Protocols may be evaluated for correctness 
and performance. In many evaluation studies of multipoint protocols, the results are dependent upon several 
factors, such as membership distribution and network topology. Hence, conclusions drawn from these studies 
depend heavily upon the evaluation scenarios. 

Protocol development usually passes through iterative cycles of refinement, which requires revisiting the
evaluation scenarios to ensure that no erroneous behavior has been introduced. This brings about the need for
validation test suites. Constructing these test suites can be an onerous and error-prone task if performed
manually. Unfortunately, little work has been done to automate the generation of such tests for multipoint network
protocols. In this paper, we propose a method for synthesizing test scenarios automatically for multipoint
protocol evaluation.

■ Worst-case analysis of protocols: It is difficult to design a protocol that would perform well in all environments. 
However, identifying breaking points that violate correctness or exhibit worst-case performance behaviors of 
a protocol may give insight to protocol designers and help in evaluating design trade-offs. In general, it is 
desirable to identify, early on in the protocol development cycle, scenarios under which the protocol exhibits 
worst or best case behavior. 

The method presented in this paper automates the generation of scenarios in which multipoint protocols exhibit 
worst and best case behaviors. 

■ Performance benchmarking: New protocols may propose to refine a mechanism with respect to a particular 
performance metric, using for evaluation those scenarios that show performance improvement. However, 
without systematic evaluation, these refinement studies often (though unintentionally) overlook other scenarios 
that may be relevant. To alleviate such a problem we propose to integrate stress test scenarios that provide 
an objective benchmark for performance evaluation. 

Using our scenario synthesis methodology we hope to contribute to the understanding of better performance 
benchmarking and the design of more robust protocols. 

1.2. BACKGROUND 

The design of multipoint protocols has introduced new challenges and problems. Some of the problems are 
common to a wide range of protocols and applications. One such problem is the multi-responder problem, where
multiple members of a group may respond (almost) simultaneously to an event. This may flood the network with
messages, which in turn may lead to synchronized responses and additional overhead (e.g., the well-known Ack
implosion problem), degrading performance.

One common technique to alleviate the above problem is the multicast damping technique, which employs a timer 
suppression mechanism (TSM). TSM is employed in several multipoint protocols, including the following: 

■ IP-multicast protocols, e.g., PIM [1] [2] and IGMP [3], use TSM on LANs to reduce Join/Prune control
overhead. 

■ Reliable multicast schemes, e.g., SRM [4] and MFTP [5], use this mechanism to alleviate Ack implosion. 
Variants of the SRM timers are used in registry replication (e.g., RRM [6]) and adaptive web caching [7]. 

■ Multicast address allocation schemes, e.g., AAP [8] and SDr [9], use TSM to avoid an implosion of responses 
during the collision detection phase. 

■ Active services [10] use multicast damping to launch one service agent ('servent') from a pool of servers.

TSM is also used in self-organizing hierarchies (SCAN [11]) and transport protocols (e.g., XTP [12] and RTP [13]).

We believe TSM is a good building block to analyze as our first end-to-end case study, since it is rich in multicast
and timing semantics, and can be evaluated using standard performance criteria. As a case study, we examine its
worst and best case behaviors in a systematic, automatic fashion.¹

In TSM, a member of a multicast group that has detected loss of a data packet multicasts a request for recovery. 
Other members of the group, that receive this request and that have previously received the data packet, schedule 
transmission of a response. In general, randomized timers are used in scheduling the response. While a response timer 
is running at one node, if a response is received from another node then the response timer is suppressed to reduce 
the number of responses triggered. Consequently, the response time may be delayed to allow for more suppression. 

Two main performance evaluation criteria used in this case are overhead of response messages and time to recover 
from packet loss. Depending on the relative delays between group members and the timer settings, the mechanism 
may exhibit different performance. In this study, our method attempts to obtain scenarios of best case and worst 
case performance according to the above criteria. 

We are not aware of any related work that attempts to achieve this goal systematically. However, we borrow from 
previous work on protocol verification and test generation. Related work is presented in Section 8. 

The rest of the paper is organized as follows. Section 2 introduces the protocol and topology models. Section 3 
outlines the main algorithm, and Section 4 presents the model for TSM. Sections 5 and 6 present performance 
analyses for protocol overhead and response time, and Section 7 presents simulation results. Related work is given 
in Section 8. Issues and future work are discussed in Section 9. We present concluding remarks in Section 10. 
Algorithmic details, mathematical models and example case studies are given in the appendices.²

2. THE MODEL 

The model is a processable representation of the system under study that enables automation of our method. Our 
overall model consists of: A) the protocol model, B) the topology model, and C) the fault model. 

The Protocol Model. We represent the protocol by a finite state machine (FSM) and the overall system 
by a global FSM (GFSM). 

I. FSM model: Every instance of the protocol, running on a single end-system, is modeled by a deterministic FSM
consisting of: (i) a set of states, (ii) a set of stimuli causing state transitions, and (iii) a state transition function (or
table) describing the state transition rules. A protocol running on an end-system $i$ is represented by the machine
$M_i = (S_i, \tau_i, \delta_i)$, where $S_i$ is a finite set of state symbols, $\tau_i$ is the set of stimuli, and $\delta_i$ is the
state transition function $S_i \times \tau_i \to S_i$.

II. Global FSM model: The global state is defined as the composition of individual end-system states. The behavior
of a system with $n$ end-systems may be described by $M_G = (S_G, \tau_G, \delta_G)$, where
$S_G = S_1 \times S_2 \times \cdots \times S_n$ is the global state space, $\tau_G = \bigcup_{i=1}^{n} \tau_i$ is the set of
stimuli, and $\delta_G$ is the global state transition function $S_G \times \tau_G \to S_G$.
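
As a rough illustration of the two definitions above, the following sketch stores a per-system transition function as a lookup table and builds the global state space as a Cartesian product. The state and stimulus names are hypothetical placeholders, and leaving the state unchanged when no rule matches is an assumption of this sketch rather than part of the model.

```python
from itertools import product

# Hypothetical two-state machine; the names are illustrative only.
STATES = ("idle", "waiting")
STIMULI = ("start", "done")

# delta_i : S_i x tau_i -> S_i, stored as a lookup table.
DELTA = {
    ("idle", "start"): "waiting",
    ("waiting", "done"): "idle",
}

def step(state, stimulus):
    """Apply delta_i; assumption: an unmatched stimulus leaves the state unchanged."""
    return DELTA.get((state, stimulus), state)

def global_states(n):
    """Enumerate S_G = S_1 x ... x S_n for n identical end-systems."""
    return list(product(STATES, repeat=n))

def global_step(gstate, system, stimulus):
    """Apply a stimulus at one end-system and return the new global state."""
    new = list(gstate)
    new[system] = step(gstate[system], stimulus)
    return tuple(new)
```

The size of the global state space grows as the product of the individual state-space sizes, which is why the search described later works backward from a target event instead of enumerating all global states.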

The Topology Model. The topology cannot be captured simply by one metric. Indeed, its dynamics may 
be complex to model and sometimes intractable. We model the delays using the delay matrix and loss patterns using 
the fault model. We use a virtual LAN (VLAN) model to represent the underlying network topology and multicast 
distribution tree. The VLAN captures delay semantics using a delay matrix D (see Figure 1), where $d_{i,j}$ is the delay
from system i to system j.

The Fault Model. A fault is a low level (e.g., physical layer) anomalous behavior that may affect the
protocol under test. Faults may include packet loss, system crashes, or routing loops. For brevity, we only consider
selective packet loss in this study. Selective packet loss occurs when a multicast message is received by some group 
members but not others. The selective loss of a message prevents the transition that this message triggers at the 
intended recipient. 
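
One way to picture the delay matrix and selective loss together is the sketch below, which computes the arrival time of a single multicast message at each receiver. The function name `deliver` and the `lost_at` parameter are hypothetical; the only assumptions taken from the text are that `delay[i][j]` holds the delay from system i to system j and that a lost message simply never arrives at the affected receivers.

```python
def deliver(sender, send_time, delay, lost_at=frozenset()):
    """Return {receiver: arrival_time} for one multicast message.

    delay[i][j] is the delay from system i to system j (the matrix D).
    lost_at is the set of receivers that miss the message (selective loss).
    """
    n = len(delay)
    return {
        j: send_time + delay[sender][j]
        for j in range(n)
        if j != sender and j not in lost_at
    }
```

For example, with `D = [[0, 2, 5], [2, 0, 1], [5, 1, 0]]`, a message sent by system 0 at time 10 reaches system 1 at time 12 and system 2 at time 15; passing `lost_at={2}` removes system 2 from the result, so no transition is triggered there.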



¹ Such behavior is not protocol specific, and if a protocol is composed of previously checked building blocks, these parts of the
protocol need not be revalidated in full. However, interaction between the building blocks still needs to be validated.
² Appendices are added for clarification and completeness of the analysis, but may be removed, without loss of cohesion, to
adhere to the page limit.

Figure 1 The virtual LAN and the delay matrix



3. ALGORITHM AND OBJECTIVES



To apply our method, the designer specifies the protocol as a global FSM model. In addition, the evaluation
criteria, be they related to performance or correctness, are given as input to the method. In this paper we address
performance criteria; correctness has been addressed in previous studies [14, 15]. The algorithm operates on the
specified model and synthesizes a set of 'test scenarios': protocol events and relations between topology delays and
timer values that stress the protocol according to the evaluation criteria (e.g., exhibit maximum overhead or delay).
In this section, we outline the algorithmic details of our method. The algorithm is further discussed in Section 5 and
illustrated by a case study.



3.1. ALGORITHM OUTLINE

Our algorithm is a variant of the fault-oriented test generation (FOTG) algorithm presented in [15]. It includes
the topology synthesis, the backward search and the forward search stages. Here, we describe those aspects of our
algorithm that deal with timing and performance semantics. The basic algorithm passes through three main steps:
(1) the target event identification, (2) the search, and (3) the task specific solution.

1 The target event: The algorithm starts from a given event, called the 'target event'. The target event (e.g.,
sending a message) is identified by the designer based on the protocol evaluation criteria, e.g., overhead.

2 The search: Three steps are taken in the search: (a) identifying conditions, (b) obtaining sequences, and
(c) formulating inequalities.

(a) Identifying conditions: The algorithm uses the protocol transition rules to identify transitions necessary
to trigger the target event and those that prevent it. These transitions are called wanted transitions and
unwanted transitions, respectively.

(b) Obtaining sequences: Once the above transitions are identified, the algorithm uses backward and forward
search to build event sequences leading to these transitions and calculates the times of these events as
follows.

i Backward search is used to identify events preceding the wanted and unwanted transitions, and
uses implication rules that operate on the protocol's transition table. Section 4.2 describes the
implication rules.

ii Forward search is used to verify the backward search. Every backward step must correspond to
valid forward step(s). Branches leading to contradictions between forward and backward search are
rejected. Forward search is also used to complete event sequences necessary to maintain system
consistency.³

(c) Formulating inequalities: Based on the transitions and timed sequences obtained in the previous steps,
the algorithm formulates relations between timer values and network delays that trigger the wanted
transitions and avoid the unwanted transitions.

3 Task specific solution: The output of the search is a set of event sequences and inequalities that satisfy
the evaluation criteria. These inequalities are solved mathematically to find a topology or timer configuration,
depending on the task definition.

³ The role of forward search will be further illustrated in the response time analysis in Section 6.

3.2. TASK DEFINITION 

We apply our method to two kinds of tasks: 

1 Topology synthesis is performed when the timer values are given, and the objective is to identify the delay
matrix that produces the best or worst case behavior. 

2 Timer configuration is performed when the topology or delay matrix is given, and the timer values that 
cause the best and worst case behavior are to be determined. 

4. THE TIMER SUPPRESSION MECHANISM (TSM) 

In this section, we present a simple description of TSM, then present its model, used thereafter in the analysis. 
TSM involves a request q and one or more responses p. When a system Q detects the loss of a data packet it sets a 
request timer and multicasts a request q. When a system i receives q it sets a response timer (e.g., randomly), the 
expiration of which, after duration $Exp_i$, triggers a response p. If the system i receives a response p from another
system j while its timer is running, it suppresses its own response. 
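
To make the mechanism concrete, the sketch below determines which responders actually transmit a response for a single request sent at time 0. It is a simplified illustration under stated assumptions, not the paper's algorithm: no messages are lost, all inter-responder delays are strictly positive, and the names `d_q`, `exp` and `d` are hypothetical stand-ins for the requester-to-responder delays, the timer durations $Exp_i$, and the responder-to-responder delays.

```python
def responders(d_q, exp, d):
    """Return the sorted list of systems that send a response.

    d_q[i]  : delay from the requester Q to responder i
    exp[i]  : response timer duration Exp_i of responder i
    d[j][i] : delay from responder j to responder i (assumed positive)
    The request is sent at time 0 and no messages are lost.
    """
    n = len(exp)
    fire = [d_q[i] + exp[i] for i in range(n)]  # scheduled response times
    sent = []                                   # responders that actually respond
    # With positive delays, only earlier-firing systems can suppress later ones.
    for i in sorted(range(n), key=lambda k: fire[k]):
        suppressed = any(
            d_q[i] < fire[j] + d[j][i] < fire[i]  # p from j arrives while i's timer runs
            for j in sent
        )
        if not suppressed:
            sent.append(i)
    return sorted(sent)
```

With `d_q = [1, 1]` and `exp = [1, 5]`, a short inter-responder delay (`d = [[0, 1], [1, 0]]`) lets system 0 suppress system 1, whereas a long one (`d = [[0, 10], [10, 0]]`) causes both to respond.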

4.1. PERFORMANCE EVALUATION CRITERIA 

We use two performance criteria to evaluate TSM: 

1 Overhead of response messages, where the worst case produces the maximum number of responses per data 
packet loss. As an extreme case, this occurs when all potential responders do indeed respond and no suppression 
takes place. 

2 The response delay, where the worst-case scenario produces the maximum loss recovery time.

4.2. TIMER SUPPRESSION MODEL 

Following is the TSM model used in the analysis.

Protocol states (S). Following is the state symbol table for the TSM model.

State    Meaning
$R$      original state of the requester Q
$R_T$    requester with the request timer set
$D$      potential responder
$D_T$    responder with the response timer set

Stimuli or Events.

1 Sending/receiving messages: transmitting response ($p_t$) and request ($q_t$), receiving response ($p_r$) and request
($q_r$).

2 Timer and other events: the events of firing the request timer $Req$ and response timer $Res$ and the event of
detecting packet loss $L$.

Notation. Following are the notations used in the transition table.

■ An event subscript denotes the system initiating the event, e.g., $p_{t_i}$ is a response sent by system i, while the
subscript m denotes multicast reception, e.g., $p_{r_m}$ denotes reception of a response by all members of the group
if no loss occurs. When system i receives a message sent by system j, this is denoted by the subscript i,j, e.g.,
$p_{r_{i,j}}$ is system i receiving a response from system j.

■ The state subscript T denotes the existence of a timer, and is used by the algorithm to apply the 'timer
implication' to fire the timer event after the expiration period.

■ A state transition has a start state and an end state and is expressed in the form startState → endState (e.g.,
$D \to D_T$). It implies the existence of a system in the startState (i.e., $D$) as a condition for the transition to
the endState (i.e., $D_T$).

■ An effect in the transition table may contain a state transition and a stimulus in the form
(startState → endState).stimulus, which indicates that the condition for triggering stimulus is the state transition.
An effect may contain several transitions (e.g., 'Trans1, Trans2'), which means that out of these transitions only
those with satisfied conditions will take effect.



Transition Table. Following is the transition table for TSM.

Symbol    Event    Effect                        Meaning
loss      $L$      $(R \to R_T).q_t$             loss detection causes $q_t$ and setting of request timer
tx_req    $q_t$    $q_{r_m}$                     transmission of q causes multicast reception of q after network delay
rcv_req   $q_r$    $D \to D_T$                   reception of q causes a system in D state to set response timer
res_tmr   $Res$    $(D_T \to D).p_t$             response timer expiration causes transmission of p and a change to D state
tx_res    $p_t$    $p_{r_m}$                     transmission of p causes multicast reception of p after network delay
rcv_res   $p_r$    $R_T \to R$, $D_T \to D$      reception of p by a system with the timer set causes suppression
req_tmr   $Req$    $q_t$                         expiration of request timer causes transmission of q
The model contains one requester Q and several potential responders (e.g., i and j).⁴ Initially, the requester Q
exists in state $R$ and all potential responders exist in state $D$. Let $t_0$ be the time at which Q sends the request q.
The request sent by Q is received by i and j at times $t_0 + d_{Q,i}$ and $t_0 + d_{Q,j}$, respectively. When the request q
is sent, the requester transitions into state $R_T$ by setting the request timer. Upon receiving a request, a potential
responder in state $D$ transitions into state $D_T$ by setting the response timer. The time at which an event occurs is
given by t(event), e.g., $q_{r_j}$ occurs at $t(q_{r_j})$.⁵
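
The per-system rows of the transition table can be held in a small lookup structure, as in the sketch below. The dictionary layout and the function name `tsm_step` are assumptions made for illustration. The tx_req and tx_res rows are omitted because they describe network delivery rather than a state change, and keeping the requester in $R_T$ after req_tmr is an assumption based on the request timer being restarted.

```python
# (event, state) -> (next state, emitted event or None).
# States and events follow the tables above; the layout itself is an assumption.
TRANSITIONS = {
    ("L",   "R"):   ("R_T", "q_t"),   # loss
    ("q_r", "D"):   ("D_T", None),    # rcv_req
    ("Res", "D_T"): ("D",   "p_t"),   # res_tmr
    ("p_r", "R_T"): ("R",   None),    # rcv_res at the requester
    ("p_r", "D_T"): ("D",   None),    # rcv_res at a responder (suppression)
    ("Req", "R_T"): ("R_T", "q_t"),   # req_tmr (assumed to restart the timer)
}

def tsm_step(event, state):
    """Return (next_state, emitted); events whose condition is not met are ignored."""
    return TRANSITIONS.get((event, state), (state, None))
```

For instance, `tsm_step("p_r", "D")` leaves a responder in state D, which reflects the fact that a response arriving before the request cannot suppress anything.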

Implication Rules. The backward search uses the following cause-effect implication rules:

1 Transmission/Reception (Tx_Rcv): By the reception of a message, the algorithm implies the transmission of
that message, without loss, sometime in the past (after applying the network delays). An example of this
implication is $p_{r_{i,j}} \Leftarrow p_{t_j}$, where $t(p_{r_{i,j}}) = t(p_{t_j}) + d_{j,i}$.

2 Timer Expiration (Tmr_Exp): When a timer expires, the algorithm infers that it was set Exp time units in
the past, and that no event occurred during that period to reset the timer. An example of this implication
is $Res_i.(D_i \leftarrow D_{T_i}) \Leftarrow D_{T_i}$, where $t(Res_i) = t(D_{T_i}) + Exp_i$, and $Exp_i$ is the duration of the
response timer $Res_i$.⁶

3 State Creation (St_Cr): A state is created from another by reversing the transition rules and going towards
the startState of the transition. For example, $D_{T_i} \Leftarrow (D_{T_i} \leftarrow D_i)$.

In the following sections we use the above model to synthesize worst and best case scenarios according to protocol 
overhead and response time. 

5. PROTOCOL OVERHEAD ANALYSIS 

In this section, we conduct worst and best case performance analyses for TSM with respect to the number of 
responses triggered per packet loss. Initially, we assume no loss of request or response messages until recovery, and 
that the request timer is high enough that the recovery will occur within one request round. The case of multiple 
request rounds is discussed in Appendix 3. 



⁴ Since there is only one requester, we simply use $q_t$ instead of $q_{t_Q}$, and $q_{r_i}$ instead of $q_{r_{i,Q}}$.

⁵ The time of a state is when the state was first created, so $t(D_{T_i})$ is the time at which i transitioned into state $D_T$.

⁶ We use the notation Event.Effect to represent a transition.

5.1. WORST-CASE ANALYSIS

Worst-case analysis aims to obtain scenarios with maximum number of responses per data loss. In this section 
we present the algorithm to obtain inequalities that lead to worst-case scenarios. These inequalities are a function of 
network delays and timer expiration values. 

Target event and conditions. Since the overhead in this case is measured as the number of response
messages, the designer identifies the event of triggering a response $p_t$ as the target event, and the goal is to maximize
the number of response messages.

The search. As previously described in Section 3.1, the main steps for the search algorithm are:

1 Identifying the wanted and unwanted transitions. 

2 Obtaining sequences leading to the above transitions, and calculating the times for these sequences. 

3 Formulating the inequalities that achieve the time constraints required to invoke wanted transitions and avoid 
unwanted transitions. 

We now apply these steps to our case study.

■ Identifying conditions: The algorithm searches for the transitions necessary to trigger the target event, 
and their conditions, recursively. These are called wanted transitions and wanted conditions, respectively. The 
algorithm also searches for transitions that nullify the target event or invalidate any of its conditions. These 
are called unwanted transitions. 

In our case the target event is the transmission of a response (i.e., $p_t$). From the transition table described
in Section 4.2, the algorithm identifies transition res_tmr [$Res.(D_T \to D).p_t$] as a wanted transition and its
condition $D_T$ as a wanted condition. Transition rcv_req [$q_r.(D \to D_T)$] is also identified as a wanted transition
since it is necessary to create $D_T$. The unwanted transition is identified as transition rcv_res [$p_r.(D_T \to D)$]
since it alters the $D_T$ state without invoking $p_t$.

■ Obtaining sequences: Using backward search, the algorithm obtains sequences and calculates time values
for the following transitions: (1) wanted transition res_tmr, (2) wanted transition rcv_req, and (3) unwanted
transition rcv_res, as follows:

1 To obtain the sequence of events for transition res_tmr, the algorithm applies implication rules (see
Section 4.2) Tmr_Exp, St_Cr, Tx_Rcv in that order, and we get

$Res_i.(D_i \leftarrow D_{T_i}).p_{t_i} \Leftarrow q_{r_i}.(D_{T_i} \leftarrow D_i) \Leftarrow q_{t_Q}$.

Hence the calculated time for $t(p_{t_i})$ becomes

$t(p_{t_i}) = t_0 + d_{Q,i} + Exp_i$,

where $t_0$ is the time at which $q_{t_Q}$ occurs.

2 To obtain the sequence of events for transition rcv_req, the algorithm applies implication rule Tx_Rcv,
and we get

$q_{r_i}.(D_{T_i} \leftarrow D_i) \Leftarrow q_{t_Q}$.

Hence the calculated time for $t(q_{r_i})$ becomes

$t(q_{r_i}) = t_0 + d_{Q,i}$.

3 To obtain the sequence of events for transition rcv_res for systems i and j, the algorithm applies implication
rules Tx_Rcv, Tmr_Exp, St_Cr, Tx_Rcv in that order, and we get

$p_{r_{i,j}}.(D_i \leftarrow D_{T_i}) \Leftarrow Res_j.(D_j \leftarrow D_{T_j}).p_{t_j} \Leftarrow q_{r_j}.(D_{T_j} \leftarrow D_j) \Leftarrow q_{t_Q}$.

Hence the calculated time for $t(p_{r_{i,j}})$ becomes

$t(p_{r_{i,j}}) = t_0 + d_{Q,j} + Exp_j + d_{j,i}$.

■ Formulating inequalities: Based on the above wanted and unwanted transitions, the algorithm avoids
transition rcv_res while invoking transition res_tmr to transit out of $D_T$. To achieve this, the algorithm
automatically derives the following inequality (see Appendix 1 for more details):

$t(p_{t_i}) < t(p_{r_{i,j}})$.

Substituting the expressions for $t(p_{t_i})$ and $t(p_{r_{i,j}})$ previously derived, we get:

$d_{Q,i} + Exp_i < d_{Q,j} + Exp_j + d_{j,i}$. (1)

In other words, $V_{t_i} < V_{t_j} + d_{j,i}$, where $V_{t_i} = d_{Q,i} + Exp_i$. $V_{t_i}$ is the time required for system i
to trigger a response transmission (if any).

Alternatively, we can avoid the unwanted transition rcv_res if the system did not exist in $D_T$ when the
response is received. Hence, the algorithm automatically derives the following inequality (see Appendix 1 for
more details):

$t(p_{r_{i,j}}) < t(q_{r_i})$. (2)

Again, substituting the expressions derived above, we get:

$d_{Q,i} > d_{Q,j} + Exp_j + d_{j,i}$.

Note that inequalities (1) and (2) are general for any number of responders, where i and j are any two responders
in the system. Figure 2 (a) and (b) show inequalities (1) and (2), respectively.
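
The two no-suppression conditions can be checked directly for a candidate delay matrix and timer setting. The sketch below is illustrative only: the function and parameter names are hypothetical, times are measured relative to $t_0$ (which cancels on both sides), and the pairwise test assumes that system j does transmit its response.

```python
def no_suppression(i, j, d_q, exp, d):
    """True if a response from j cannot suppress i, per inequality (1) or (2)."""
    v_i = d_q[i] + exp[i]                # t(p_t_i) - t_0
    arrival = d_q[j] + exp[j] + d[j][i]  # t(p_r_{i,j}) - t_0
    ineq1 = v_i < arrival                # i fires before j's response arrives
    ineq2 = arrival < d_q[i]             # j's response arrives before i sets its timer
    return ineq1 or ineq2

def is_worst_case(d_q, exp, d):
    """True if every ordered pair (i, j), i != j, avoids suppression."""
    n = len(exp)
    return all(
        no_suppression(i, j, d_q, exp, d)
        for i in range(n) for j in range(n) if i != j
    )
```

With `d_q = [1, 1]` and `exp = [1, 5]`, the matrix `d = [[0, 10], [10, 0]]` satisfies inequality (1) for both ordered pairs, so both systems respond; with `d = [[0, 1], [1, 0]]` the check fails for the pair (i = 1, j = 0).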



[Figure 2: timing diagrams for the requester Q and systems 1 and 2. Panel (a): $t(p_{t_2}) < t(p_{r_{2,1}})$.
Panel (b): $t(p_{r_{2,1}}) < t(q_{r_2})$. Panel (c): $t(p_{t_2}) > t(p_{r_{2,1}})$ and $t(p_{r_{2,1}}) > t(q_{r_2})$.]

Figure 2 Possible event sequencing: (a) and (b) sequences do not lead to suppression, while (c) leads to timer
suppression.

Task specific solutions.

■ Topology synthesis: Given the timer expiration values or ranges, we want to find a feasible solution for the
worst-case delays. A feasible solution in this context means assigning positive values to the delays $d_{i,j}\ \forall i,j$.
In equation (1) above, if we take $d_{Q,i} = d_{Q,j}$,⁷ we get:

$Exp_i - Exp_j < d_{j,i}$.

These inequalities put a lower limit on the delays $d_{j,i}$; hence, we can always find a positive $d_{j,i}$ to satisfy the
inequalities.

Note that the delays used in the delay matrix reflect delays over the multicast distribution tree. In general,
these delays are affected by several factors including the multicast and unicast routing protocols, tree type 
and dynamics, propagation, transmission and queuing delays. One simple topology that reflects the delays of 
the delay matrix is a completely connected network where the underlying multicast distribution tree coincides 
with the unicast routing. 

■ Timer configuration: Given the delay values or ranges (i.e., bounds), we want to obtain timer expiration 
values that produce worst-case behavior. 

We can obtain a range for the relative timer settings (i.e., $Exp_i - Exp_j$) using equation (1) above.

The system of inequalities given by (1) and (2) above can be solved in the general case using linear
programming (LP) techniques (see Appendix 2 for more details). Section 7 uses the above solutions to synthesize
simulation scenarios.
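
For the topology synthesis task with equal requester-to-responder delays, a feasible worst-case delay matrix can be written down directly from the lower bound $Exp_i - Exp_j < d_{j,i}$. The sketch below is one such construction under those assumptions; it is not the LP formulation of Appendix 2, and the `margin` parameter and function name are hypothetical.

```python
def synthesize_delays(exp, d_q=1.0, margin=1.0):
    """Return (d_q_list, d) such that inequality (1) holds for every pair.

    exp    : timer durations Exp_i (given)
    d_q    : common requester-to-responder delay (assumed equal for all i)
    margin : positive slack added above the lower bound (hypothetical parameter)
    """
    n = len(exp)
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                # Need d[j][i] > Exp_i - Exp_j and d[j][i] > 0.
                d[j][i] = max(exp[i] - exp[j], 0.0) + margin
    return [d_q] * n, d
```

For `exp = [1.0, 5.0, 2.0]` this yields, for example, a delay of 5.0 from system 0 to system 1, which is just large enough that the early response from system 0 arrives after system 1 has already fired.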

Note, however, that it may not be feasible to satisfy all these constraints, due to upper bounds on the delays for 
example. In this case the problem becomes one of maximization, where the worst-case scenario is one that triggers 
maximum number of responses per packet loss. This problem is discussed in Appendix 2. 

5.2. BEST-CASE ANALYSIS 

Best case overhead analysis constructs constraints that lead to maximum suppression, i.e., the minimum number of
responses. The following conditions are formulated using steps similar to those given in the worst-case analysis:

$t(p_{t_i}) > t(p_{r_{i,j}})$, (3)

and

$t(p_{r_{i,j}}) > t(q_{r_i})$. (4)

These are complementary conditions to those given in the worst-case analysis. Figure 2 (c) shows inequalities (3)
and (4). Refer to Appendix 1 for more details on the inequality derivation.⁸

In this section, we have described the algorithm to construct worst and best-case delay/timer relations for overhead
of response messages. Solutions to these relations represent delay/timer settings for stress scenarios.

6. RESPONSE TIME ANALYSIS 

In this section, we conduct the performance analysis with respect to the response time. For our analysis, we allow 
selective loss of a single response message during the recovery phase.⁹ In this case, transition rules are applied to
only those systems that receive the message. 

The algorithm obtains possible sequences leading to the target event and calculates the response time for each 
sequence. To synthesize the worst case scenario that maximizes the response time, for example, the sequence with 
maximum time is chosen. 

⁷ The number of inequalities ($n^2 - n$, where n is the number of responders) is less than the number of unknowns
($n^2$: the $n^2 - n$ delays $d_{i,j}$ plus the n delays $d_{Q,i}$); hence there are multiple solutions. We can obtain a
solution by assigning values to n unknowns (e.g., $d_{Q,i}$) and solving for the others.

⁸ Complete details of the best-case analysis and the task specific solutions have been worked out and will be included in a
more elaborate technical report. They are omitted here for brevity.

⁹ Without loss of response messages this problem becomes one of maximizing the round trip delay from the requester to the
first responder.

6.1. TARGET EVENT

The response time is the time taken by the mechanism to recover from the packet loss, i.e., until the requester
receives the response p and resets its request timer by transitioning out of the $R_T$ state. In other words, the response
interval is $t(p_{r_Q}) - t(q_{t_Q}) = t(p_{r_Q}) - t_0$. The designer identifies $t(p_{r_Q})$ as the target time; hence,
$p_{r_Q}$ is the target event.

6.2. THE SEARCH 

We present in detail the case of a single responder, then discuss the case of multiple responders.

■ Backward search: As shown in Figure 3, the backward search starts from $p_{r_Q}$ and is performed over the
transition table (in Section 4.2) using the implication rules in Section 4.2, yielding:¹⁰

$D_j.p_{r_Q}.(R_Q \leftarrow R_{T_Q}) \Leftarrow p_{t_j}.(D_j \leftarrow D_{T_j}).Res_j.R_{T_Q} \Leftarrow q_{r_j}.(D_{T_j} \leftarrow D_j).R_{T_Q}$

At this point the algorithm reaches a branching point, where two possible preceding states could cause $q_{r_j}$:

- The first is transition loss [$D_j.q_{t_Q}.(R_{T_Q} \leftarrow R_Q)$], and since the initial state $R_Q$ is reached, the
backward search ends for this branch.

- The second is transition req_tmr [$D_j.Req_Q.q_{t_Q}.R_{T_Q}$]. Note that $Req_Q$ indicates the need for a transition
to $R_{T_Q}$, and the search for this last state eventually yields $D_j.q_{t_Q}.(R_{T_Q} \leftarrow R_Q)$.

Figure 3 Backward search for response time analysis. 



■ Forward search: The algorithm performs a forward search and checks for consistency of the GFSM. 

The forward search step may lead to a contradiction with the original backward search, causing rejection of that 
branch as a feasible sequence. For example, as shown in Figure 4, one possible forward sequence from the 
initial state gives: 

$D_j.q_{t_Q}.(R_Q \rightarrow R_{T_Q}) \Rightarrow q_{r_j}.(D_j \rightarrow D_{T_j}).R_{T_Q} \Rightarrow p_{t_j}.(D_{T_j} \rightarrow D_j).Res_j.R_{T_Q}$ 

The algorithm then searches two possible next states: 

— If $p_{t_j}$ is not lost, and hence causes $p_{r_Q}$, then the next state is $D_j.R_Q$. But the original backward search 
started from $D_j.q_{t_Q}.Req_Q.R_{T_Q}$, which cannot be reached from $D_j.R_Q$. Hence, we get a contradiction and 
the algorithm rejects this sequence. 

— If the response $p$ is lost by $Q$, we get $D_j.R_{T_Q}$, which leads to $D_j.Req_Q.q_{t_Q}.R_{T_Q}$. The algorithm identifies 
this as a feasible sequence. 



10 The GFSM may be represented by composition of individual states (e.g., $State_1.State_2$ or $transition_1.State_2$). 
Figure 4 Forward search for response time analysis. 



Calculating the time for each feasible sequence, the algorithm identifies the latter sequence as one of maximum 
response time. 

For multiple responders, the algorithm automatically explores the different possible selective loss patterns of 
the response message. The search identified the sequence with maximum response time as one in which only one responder 
triggers a response that is selectively lost by the requester. To construct such a sequence, the algorithm creates 
conditions and inequalities similar to those formulated for the best-case overhead analysis with respect to the number of 
responses (see Section 5.2). 



To evaluate the utility of our method, we have conducted a set of simulations for the scalable reliable multicast 
(SRM) [4] based on our worst-case scenario synthesis results for the timer-suppression mechanism. We tied our 
method to the network simulator (NS) [16]. The output of our method, in the form of inequalities (see Section 5), 
is solved using a mathematical package (LINDO). The solution, in terms of a delay matrix, is then used to generate 
the simulation topologies for NS automatically. 

For our simulations we measured the number of responses triggered for each data packet loss. We have conducted 
two sets of simulations, each using two sets of topologies. The simulated topologies included topologies with up to 
200 nodes. The first set of topologies was generated according to the overhead analysis presented in this paper. We 
call this set of topologies the stress topologies. An example stress topology is shown in Figure 5. The second set 
of topologies was generated by the GT-ITM topology generator [17], generating both flat random and transit stub 
topologies 11 . We call this set of topologies the random topologies 12 . 

The first set of simulations was conducted for the SRM deterministic timers 13 . The results of the simulation are 
shown in Figure 6. The number of responses triggered for all the stress topologies was $n - 1$, where $n$ is the number of 
nodes in the topology (i.e., no suppression occurred). For the random topologies, the number of responses triggered 
was almost 20 in the worst case. 

Using the same two sets of topologies, the second set of simulations was conducted for the SRM adaptive timers 14 . 
The results are given in Figure 6. For the stress topologies, almost 50% of the nodes in the topology triggered 
responses, whereas the random topologies generated almost 10 responses in the worst case. 

11 The topology generator is probably representative of a standard tool for topology generation used in networking research. 
Using GT-ITM we have covered most topologies used in several SRM studies [18] [19]. 

12 We faced difficulties when choosing the lossy link for the random topologies in order to maximize the number of responses. 
This is an example of the difficulties networking researchers face when trying to stress networking protocols in an ad-hoc way. 
13 SRM response timer values are selected randomly from the interval $[D_1 \cdot d_r, (D_1 + D_2) \cdot d_r]$, where $d_r$ is the estimated distance 
to the requester, and $D_1$, $D_2$ depend on the timer type. For deterministic timers $D_2 = 0$ and $D_1 = 1$. 

14 Adaptive timers adjust their interval based on the number of duplicate responses received and the estimated distance to the 
requester. 
Figure 5 An example 6 node stress topology used for the simulation (legend: S, source; Q, requester; link delays of 5 ms and 105 ms). 




Figure 6 Simulation results for deterministic and adaptive timers over stress and random topologies. 



These simulations illustrate how our method may be used to generate consistent worst-case scenarios in a scalable 
fashion. It is interesting to notice that worst-case topologies generated for simple timers also experienced substantial 
overhead (perhaps not the worst, though) for more complicated timers (such as the adaptive timers). It is also 
obvious from the simulations that stress scenarios are more consistent than the other scenarios when used to compare 
different mechanisms, in this case deterministic and adaptive timers; the performance gain for adaptive timers is very 
clear under stress scenarios. 

So, in addition to experiencing the worst-case behavior of a mechanism, our stress methodology may be used to 
compare protocols in the above fashion and to aid in making design trade-offs. It is a useful tool for generating 
meaningful simulation scenarios that we believe should be considered in performance evaluation of protocols in 
addition to the average case performance and random simulations. We plan to apply our method to test a wider 
range of protocols through simulation. 

8. RELATED WORK 

Related work falls mainly in the areas of protocol verification, VLSI test generation and network simulation. 

There is a large body of literature dealing with verification of protocols. Verification systems typically address well- 
defined properties -such as safety, liveness, and responsiveness [20]- and aim to detect violations of these properties. 
In general, the two main approaches for protocol verification are theorem proving and reachability analysis [21]. 
Theorem proving systems define a set of axioms and relations to prove properties, and include model-based and logic-based 
formalisms [22, 23]. These systems are useful in many applications. However, they tend to abstract 
out some of the network dynamics that we study (e.g., selective packet loss). Moreover, they do not synthesize network 
topologies and do not address performance issues per se. 

Reachability analysis algorithms [24], on the other hand, try to inspect reachable protocol states, and suffer from 
the 'state space explosion' problem. To circumvent this problem, state reduction techniques could be used [25]. 
These algorithms, however, do not synthesize network topologies. Reduced reachability analysis has been used in 
the verification of cache coherence protocols [26], using a global FSM model. We adopt a similar FSM model and 
extend it for our approach in this study. However, our approach differs in that we address end-to-end protocols, that 
encompass rich timing, delay, and loss semantics, and we address performance issues (such as overhead or response 
delays) . 

There is a good number of publications dealing with conformance testing [27] [28] [29] [30]. However, conformance 
testing verifies that an implementation (as a black box) adheres to a given specification of the protocol by constructing 
input/output sequences. Conformance testing is useful during the implementation testing phase -which we do not 
address in this paper- but does not address performance issues nor topology synthesis for design testing. By contrast, 
our method synthesizes test scenarios for protocol design, according to evaluation criteria. 

Automatic test generation techniques have been used in several fields. VLSI chip testing [31] uses test vector 
generation to detect target faults. Test vectors may be generated based on circuit and fault models, using the 
fault-oriented technique, that utilizes implication techniques. These techniques were adopted in [15] to develop fault- 
oriented test generation (FOTG) for multicast routing. In [15], FOTG was used to study correctness of a multicast 
routing protocol on a LAN. We extend FOTG to study performance of end-to-end multipoint mechanisms. We 
introduce the concept of a virtual LAN to represent the underlying network, integrate timing and delay semantics 
into our model and use performance criteria to drive our synthesis algorithm. 

In [14], a simulation-based stress testing framework based on heuristics was proposed. However, that method does 
not provide automatic topology generation, nor does it address performance issues. The VINT [32] tools provide a 
framework for Internet protocols simulation. Based on the network simulator (NS) [16] and the network animator 
(NAM) [33], VINT provides a library of protocols and a set of validation test suites. However, it does not provide a 
generic tool for generating these tests automatically. Work in this paper is complementary to such studies, and may 
be integrated with network simulation tools as we do in Section 7. 

9. ISSUES AND FUTURE WORK 

In this paper we have presented our first endeavor to automate test synthesis as it applies to performance 
evaluation of multipoint protocols. Our case studies were by no means exhaustive; however, they gave us insights 
into the research issues involved. Future work should explore potential extensions and applications of our method. 

■ Automated generation of simulation test suites 

Simulation is a valuable tool for designing and evaluating network protocols. Researchers usually use their 
insight and expertise to develop simulation inputs and test suites. Our method may be used to assist in 
automating the process of choosing simulation inputs and scenarios. 

The inputs to the simulation may include the topology, host events (such as traffic models) , network dynamics 
(such as link failures or packet loss) and membership distribution and dynamics. 

Our future work includes implementing a more complete tool to automate our method (including search 
algorithms and modeling semantics) and tie it to a network simulator to be applied to a wider range of 
multipoint protocols. 

■ Validating protocol building blocks 
The design of new protocols and applications often borrows from existing protocols or mechanisms. Hence, 
there is a good chance of re-using established mechanisms, as appropriate, in the protocol design process. 
Identifying, verifying and understanding building blocks for such mechanisms is necessary to increase their 
re-usability. Our method may be used as a tool to improve that understanding in a systematic and automatic 
manner. 

Ultimately, one may envision that a library of these building blocks will be available, from which protocols (or 
parts thereof) will be readily composable and verifiable using CAD tools; similar to the way circuit and chip 
design is carried out today using VLSI design tools. In this work and earlier works [15] [14], some mechanistic 
building blocks for multipoint protocols were identified, namely, the timer-suppression mechanism and the 
Join/Prune mechanism (for multicast routing). More work is needed to identify more building blocks to cover 
a wider range of protocols and mechanisms. 

■ Generalization to performance bound analysis 

An approach similar to the one we have taken in this paper may be based on some performance bounds, instead 
of worst or best case analyses. We call such an approach 'condition-oriented test generation'. 
For example, a target event may be defined as 'the response time exceeding certain delay bounds' (either 
absolute or parametrized bounds). If such a scenario is not feasible, that indicates that the protocol gives 
absolute guarantees (under the assumptions of the study). This may be used to design and analyze quality- 
of-service or real-time protocols. 

■ Applicability to other problem domains 

So far, our method has been applied to case studies on multipoint protocol performance evaluation in the 
context of the Internet. 

Other problem and application domains may introduce new mechanistic semantics or assumptions about the 
system or environment. One example of such domains includes sensor networks. These networks, similar to 
ad-hoc networks, assume dynamic topologies, lossy channels, and deal with stringent power constraints, which 
differentiates their protocols from Internet protocols [34]. 
Possible research directions in this respect include: 

— Extending the topology representation or model to capture dynamics, where delays vary with time. 

— Defining new evaluation criteria that apply to the specific problem domain, such as power usage. 

— Investigating the algorithms and search techniques that best fit the new model or evaluation criteria. 

10. CONCLUSION 

We have presented a methodology for scenario synthesis for performance evaluation of multipoint protocols. We 
used a virtual LAN model to represent the underlying network topology and an extended global FSM model to 
represent the protocol mechanism. We adopted the fault-oriented test generation algorithm for search, and extended 
it to capture timing/delay semantics and performance issues for end-to-end multipoint protocols. 

Our method was applied to performance evaluation of the timer suppression mechanism, a common building 
block for various multipoint protocols. Two performance criteria were used for evaluation of the worst and best case 
scenarios: the number of responses per packet loss, and the response delay. Simulation results illustrate how our 
method can be used in a scalable fashion to test and compare reliable multicast protocols. 

We do not claim to have a generalized algorithm that applies to any arbitrary protocol. However, we hope 
that similar approaches may be used to identify and analyze other protocol building blocks. We believe that such 
systematic analysis tools will be essential in designing and testing protocols of the future. 
Appendix: Algorithmic Details 

In this appendix we present details of inequality formulation for the end-to-end performance evaluation. In 
addition, we present the mathematical model to solve these inequalities. We also discuss the case of multiple request 
rounds for the timer suppression mechanism, and present several example case studies. 



1. DERIVING STRESS INEQUALITIES 

Given the target event, transitions are identified as either wanted or unwanted transitions, according to the 
maximization or minimization objective. For maximization, wanted transitions are those that establish conditions to 
trigger the target event, while unwanted transitions are those that nullify these conditions. 

Let W be the wanted transition and t(W) be the time of its occurrence. Let C be the condition for the wanted 
transition and t(C) be the time at which it is satisfied, and let U be the unwanted transition occurring at t(U). 

We want to establish and maintain C until W occurs, i.e., in the duration [t(C), t(W)]. Hence, U may only occur 
outside (before or after) that interval. In Figure A.1 this means that U can only occur in region 1 or region 3. 



Figure A.1 The time-line for transition ordering (region 1: before t(C); region 2: between t(C) and t(W); region 3: after t(W)) 



Hence, the inequalities must satisfy the following 

1 the condition for the wanted transition, C, must be established before the event for the wanted transition, W , 
triggers, i.e., t(C) < t(W), and 

2 one of the following two conditions must be satisfied: 

(a) the unwanted transition, U, must occur before C, i.e., t(U) < t(C), or 

(b) the unwanted transition, U, must occur after the wanted transition, W, i.e., t(W) < t(U). 

These conditions must be satisfied for all systems. In addition, the algorithm needs to verify, using backward 
search and implication rules, that no contradiction exists between the above conditions and the nature of the events 
of the given problem. 
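
The ordering rule above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the implementation used in this work: the function names `stress_constraints` and `satisfied`, and the encoding of an inequality t(a) < t(b) as the pair (a, b), are introduced here for illustration.

```python
def stress_constraints(cond, wanted, unwanted):
    """Return the ordering constraints for one wanted/unwanted transition pair.

    Each pair (a, b) stands for the inequality t(a) < t(b). The result has
    two parts: the constraint that must always hold, and a list of
    alternatives of which at least one must hold.
    """
    must_hold = (cond, wanted)            # t(C) < t(W)
    one_of = [
        (unwanted, cond),                 # t(U) < t(C): U falls in region 1
        (wanted, unwanted),               # t(W) < t(U): U falls in region 3
    ]
    return must_hold, one_of


def satisfied(times, must_hold, one_of):
    """Check the constraints against concrete event times (a dict)."""
    def before(pair):
        return times[pair[0]] < times[pair[1]]

    return before(must_hold) and any(before(p) for p in one_of)


must, alts = stress_constraints("C", "W", "U")
print(satisfied({"C": 1.0, "W": 3.0, "U": 4.0}, must, alts))  # U after W: True
print(satisfied({"C": 1.0, "W": 3.0, "U": 2.0}, must, alts))  # U inside [t(C), t(W)]: False
```

The sketch only checks a given assignment of event times; the consistency check against the implication rules described above is not modeled.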

1.1. WORST-CASE OVERHEAD ANALYSIS 

The target event for the overhead analysis is $p_t$. 

The objective for the worst case analysis is to maximize the number of responses $p_t$. The wanted transition is 
transition res_tmr $[Res.(D_T \rightarrow D).p_t]$ (see Section 4). Hence $t(W) = t(p_t)$. The condition for the wanted transition 
is $D_T$ and its time (from transition tx_req $[q_r.(D \rightarrow D_T)]$) is $t(C) = t(q_r)$. 

The unwanted transition is one that nullifies the condition $D_T$. Transition rcv_res $[p_r.(D_T \rightarrow D)]$ is identified by 
the algorithm as the unwanted transition, hence $t(U) = t(p_r)$. 

For a given system $i$, the inequalities become: 

$t(q_{r_i}) < t(p_{t_i})$, 

and either 

$t(p_{r_{i,j}}) < t(q_{r_i})$ 

or 

$t(p_{t_i}) < t(p_{r_{i,j}})$. 

t(Cond) < t(Wanted) -> $t(q_{r_i}) < t(p_{t_i})$ 
and 
[t(Unwanted) < t(Cond) -> $t(p_{r_{i,j}}) < t(q_{r_i})$ 
or 
t(Wanted) < t(Unwanted) -> $t(p_{t_i}) < t(p_{r_{i,j}})$] 

Figure A.2 Formulating the inequalities automatically 

The above automated process is shown in Figure A.2. From the timer expiration implication rule, however, we get 
that the response timer must have been set earlier by the request reception, i.e., $Res_i.(D_i \leftarrow D_{T_i}).p_{t_i} \Leftarrow q_{r_i}.(D_{T_i} \leftarrow D_i)$ 
and $t(p_{t_i}) = t(q_{r_i}) + Exp_i$. Hence, $t(q_{r_i}) < t(p_{t_i})$ is readily satisfied and we need not add any constraints on the 
expiration timers or delays to satisfy this condition. Thus, the inequalities formulated by the algorithm to produce 
worst-case behavior are: 

$t(p_{r_{i,j}}) < t(q_{r_i})$ 

or 

$t(p_{t_i}) < t(p_{r_{i,j}})$. 

1.2. BEST-CASE ANALYSIS 

Using a similar approach to the above analysis, the algorithm identifies transition rcv_res $[p_r.(D_T \rightarrow D)]$ as the 
wanted transition. Hence $t(W) = t(p_r)$, and $t(C) = t(q_r)$. The unwanted transition is transition res_tmr, and 
$t(U) = t(p_t)$. 

For system $i$ the inequalities become: 

$t(q_{r_i}) < t(p_{r_{i,j}})$, 

and either 

$t(p_{t_i}) < t(q_{r_i})$ 

or 

$t(p_{r_{i,j}}) < t(p_{t_i})$. 

But from the backward implication we have $t(q_{r_i}) < t(p_{t_i})$. Hence, the algorithm encounters a contradiction and 
the inequality $t(p_{t_i}) < t(q_{r_i})$ cannot be satisfied. 

Thus, the inequalities formulated by the algorithm to produce best-case behavior are: 

$t(q_{r_i}) < t(p_{r_{i,j}})$, 

and 

$t(p_{r_{i,j}}) < t(p_{t_i})$. 

2. SOLVING THE SYSTEM OF INEQUALITIES 

In this section we present the general model of the constraints (or inequalities) generated by our method. As a 
first step, we form a linear programming problem and attempt to find a solution. If a solution is not found, then we 
form a mixed integer non-linear programming problem to find the maximum number of feasible constraints. 

In general, the system of inequalities generated by our method to obtain worst or best case scenarios, can be 
formulated as a linear programming problem. 

In our case, satisfying all the constraints regardless of the objective function, leads to obtaining the absolute 
worst/best case. For example, in the case of worst case overhead analysis, this means obtaining the scenario leading 
to no-suppression. 

The inequalities formulated by our method, as given in Section 5, are as follows. 

■ for the worst case behavior: 

$d_{Q,i} + Exp_i < d_{Q,j} + Exp_j + d_{j,i}$, 

or 

$d_{Q,i} > d_{Q,j} + Exp_j + d_{j,i}$. 

■ for the best case behavior: 

$d_{Q,i} + Exp_i > d_{Q,j} + Exp_j + d_{j,i}$, 

and 

$d_{Q,i} < d_{Q,j} + Exp_j + d_{j,i}$. 
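
As a rough illustration of how the worst-case inequalities can be checked against a concrete delay assignment, the following Python sketch counts the responders for which one of the two worst-case alternatives holds with respect to every other responder. It is a pairwise check only, under assumptions made here: the function names, the list-based encoding of the delays, and the scalar (non-interval) timer values are illustrative and not part of the method itself.

```python
def responds(i, d_q, exp, d):
    """True if responder i is not suppressed by any other responder.

    d_q[i]  : delay from the requester Q to responder i
    exp[i]  : expiration value of i's response timer (a single number here)
    d[j][i] : delay from responder j to responder i
    """
    for j in range(len(d_q)):
        if j == i:
            continue
        arrival = d_q[j] + exp[j] + d[j][i]      # time j's response reaches i
        fires_first = d_q[i] + exp[i] < arrival  # i's timer expires before that
        too_early = d_q[i] > arrival             # response precedes the request at i
        if not (fires_first or too_early):
            return False                         # j's response suppresses i
    return True


def count_responses(d_q, exp, d):
    """Number of responders that trigger a response."""
    return sum(responds(i, d_q, exp, d) for i in range(len(d_q)))


d_q = [100, 100, 100]
exp = [100, 300, 300]
far = [[0, 350, 350], [350, 0, 350], [350, 350, 0]]
near = [[0, 50, 50], [50, 0, 50], [50, 50, 0]]
print(count_responses(d_q, exp, far))    # 3: no suppression
print(count_responses(d_q, exp, near))   # 1: only responder 0 responds
```

With large inter-responder delays no response arrives in time to suppress another timer, which is the no-suppression scenario the worst-case inequalities describe.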

The above systems of inequalities can be represented by a linear programming model. The general form of 
a linear programming (LP) problem is: 

Maximize $Z = C^T X = \sum_{0 < i \le n} c_i \cdot x_i$ 

subject to: 

$AX \le B$ 
$X \ge 0$ 

where $Z$ is the objective function, $C$ is a vector of $n$ constants $c_i$, $X$ is a vector of $n$ variables $x_i$, $A$ is an $m \times n$ 
matrix, and $B$ is a vector of $m$ elements. 

The above problem can be solved practically in polynomial time using Karmarkar's algorithm [35] or the simplex method [36], if 
a feasible solution exists. 

In some cases, however, the absolute worst/best case may not be attainable, and it may not be possible to find a 
feasible solution to the above problem. In such cases we want to obtain the maximum feasible set of constraints in 
order to get the worst/best case scenario. To achieve this, we define the problem as follows: 

Maximize $\sum_{0 < i \le m} y_i$ 

subject to: 

$y_i \cdot f_i(x) \le 0, \forall i$ 
$y_i \in \{0, 1\}$ 

or 

$y_i \cdot (1 - y_i) = 0$ 

where $f_i(x) \le 0$ is the original constraint from the previous problem. 

This problem is a mixed integer non-linear programming (MINLP) problem, that can be solved using branch and 
bound methods [37]. 
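
The objective of this second formulation (satisfy as many of the original constraints as possible) can be illustrated with a small Python sketch. It replaces branch and bound by an exhaustive scan over a list of candidate points, so it is only a toy stand-in for a real solver; the function names and the one-variable constraints are hypothetical and chosen here for illustration.

```python
def count_satisfied(constraints, x):
    """Number of constraints f(x) <= 0 that hold at x (the sum of the y_i)."""
    return sum(1 for f in constraints if f(x) <= 0)


def best_candidate(constraints, candidates):
    """Exhaustively pick the candidate that satisfies the most constraints."""
    return max(candidates, key=lambda x: count_satisfied(constraints, x))


# Hypothetical one-variable constraints, each written as f(x) <= 0.
constraints = [
    lambda x: x - 2,   # x <= 2
    lambda x: 5 - x,   # x >= 5, which conflicts with the first constraint
    lambda x: x - 6,   # x <= 6
]

best = best_candidate(constraints, range(0, 8))
print(best, count_satisfied(constraints, best))
```

Because the first two constraints conflict, no candidate satisfies all three; the scan returns a point that satisfies two of them, which corresponds to the maximum feasible set in this toy case.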

3. MULTIPLE REQUEST ROUNDS 

In Section 5 we conducted the protocol overhead analysis with the assumption that recovery will occur in one 
round of request. In general, however, loss recovery may require multiple rounds of request, and we need to consider 
the request timer as well as the response timers. Considering multiple timers or stimuli adds to the branching factor 
of the search. Some of these branches may not satisfy the timing and delay constraints. It would be more efficient 
then to incorporate timing semantics into the search technique to prune off infeasible branches. 

Let us consider forward search first. For example, consider the global state $q_{t_i}.R_{T_i}$ having a transmitted request 
message and a request timer running. Depending on the timer expiration value $Exp_i$ and the delay experienced by 
the message $d_{i,j}$, we may get different successor states. If $d_{i,j} > Exp_i$ then the request timer fires first, triggering the 
event $Req_i$, and we get $q_{t_i}.Req_i$ as the successor state. Otherwise, the request message will be received first, and the 
successor state will be $q_{r_j}.R_{T_i}$. Note that in this case the timer value must be decremented by $d_{i,j}$. This is illustrated 
in Figure A.3. The condition for branching is given on the arrow of the branch, and the timer value of $i$ is given by 
$T_i$. 



Figure A.3 Forward search for multiple simultaneous events (from $q_{t_i}.R_{T_i}$, one branch leads to $q_{t_i}.Req_i$ and the other to $q_{r_j}.R_{T_i}$ with $T_i = Exp_i - d_{i,j}$) 
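
The forward branching decision described above can be written as a small Python function. This is a minimal sketch assuming single numeric values for the timer and the delay; the function name, the string labels for the states, and the convention of reporting a remaining timer value of 0 after expiry are assumptions made here for illustration.

```python
def forward_successor(exp_i, d_ij):
    """Successor of the state q_t_i.R_T_i and the remaining request-timer value.

    exp_i : expiration value of the request timer at i
    d_ij  : delay experienced by the request message from i to j
    """
    if d_ij > exp_i:
        # The request timer fires while the request is still in flight.
        # Reporting 0 as the remaining value is an assumption of this sketch.
        return "q_t_i.Req_i", 0
    # The request is received first; the timer keeps running, reduced by d_ij.
    return "q_r_j.R_T_i", exp_i - d_ij


print(forward_successor(exp_i=100, d_ij=150))   # ('q_t_i.Req_i', 0)
print(forward_successor(exp_i=100, d_ij=30))    # ('q_r_j.R_T_i', 70)
```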



For backward search, instead of decreasing timer values (as is done with forward search), timer values are increased, 
and the starting point of the search is arbitrary in time, as opposed to time '0' for forward search. 

To illustrate, consider the global state having $(D_i \leftarrow D_{T_i}).R_{T_j}$, with the request timer running at $j$ and the 
response timer firing at $i$. 

Figure A.4 shows the backward branching search, with the timer values at each step and the condition for each 
branch. In the first state, the timer $T_Q$ starts at an arbitrary point in time $x$, and the timer $T_i$ is set to '0' (i.e., the 
timer expired, triggering a response $p_{t_i}$). One step backward, either the timer at $Q$ must have been started '$Exp_Q - x$' 
units in the past, or the response timer must have been started '$Exp_i$' units in the past. Depending on the relative 
values of these times some branch(es) become valid. The timer values at each step are updated accordingly. Note 
that if a timer expires while a message is in flight (i.e., transmitted but not yet received), we use the $m$ subscript to 
denote it is still multicast, as in $q_{t_m}$ in the figure. 
Figure A.4 Backward search for multiple simultaneous events 



Sometimes, the values of the timers and the delays are given as ranges or intervals. In the following, we present how 
branching decisions are made when comparing intervals. 

Branching decision for intervals 

In order to conduct the search for multiple stimuli, we need to check the constraints for each branch. To decide 
on the branches valid for search, we compare values of timers and delays. These values are often given as intervals, 
e.g. [a, b]. 

Comparison of two intervals $Int_1 = [a_1, b_1]$ and $Int_2 = [a_2, b_2]$ is done according to the following rules. 

Branch $Int_1 > Int_2$ becomes valid if there exists a value in $[a_1, b_1]$ that is greater than a value in $[a_2, b_2]$, i.e., if 
there is overlap of more than one number between the intervals. We define the '<' and '=' relations similarly, i.e., if 
there are any numbers in the intervals that satisfy the relation then the branch becomes valid. 

For example, suppose we have the following branch conditions: (i) $Exp_i < Exp_j$, (ii) $Exp_i = Exp_j$, and (iii) $Exp_i > Exp_j$. 
If $Exp_i = [3, 5]$ and $Exp_j = [4, 6]$, then, according to our above definitions, all the branch conditions are valid. 
However, if $Exp_i = [3, 5]$ and $Exp_j = [5, 7]$, then only branches (i) and (ii) are valid. 
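
The interval comparison rules above translate directly into code. The following Python sketch assumes closed intervals given as (low, high) tuples; the function name `valid_branches` is introduced here for illustration only.

```python
def valid_branches(int1, int2):
    """Return the relations in {'<', '=', '>'} that are valid branches.

    A relation is valid if some value of int1 and some value of int2
    satisfy it. Intervals are closed and given as (low, high) tuples.
    """
    a1, b1 = int1
    a2, b2 = int2
    valid = set()
    if a1 < b2:                  # smallest value of int1 is below the largest of int2
        valid.add("<")
    if a1 <= b2 and a2 <= b1:    # the intervals share at least one value
        valid.add("=")
    if b1 > a2:                  # largest value of int1 is above the smallest of int2
        valid.add(">")
    return valid


print(sorted(valid_branches((3, 5), (4, 6))))   # ['<', '=', '>']
print(sorted(valid_branches((3, 5), (5, 7))))   # ['<', '=']
```

The two calls reproduce the example in the text: all three branches are valid for [3, 5] against [4, 6], and only '<' and '=' for [3, 5] against [5, 7].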

The above definitions are sufficient to cover the forward search branching. However, for backward search branching, 
we may have an arbitrary value x as noted above. 

For example, take the state $(D_i \leftarrow D_{T_i}).R_{T_Q}$. Consider the timer at $Q$, the expiration duration of which is $Exp_Q$ 
and the value of which is $x$, and the timer at $i$, the expiration duration of which is $Exp_i$ and the value of which is '0', 
as given in Figure A.4. Depending on the relative values of $Exp_i$ and $Exp_Q - x$ the search follows some branch(es). If 
$Exp_Q = [a_1, b_1]$, then $x = [0, b_1]$ and $Exp_Q - x = [0, b_1]$. Hence, we can apply the forward branching rules described 
earlier by taking $Exp_Q - x = [0, b_1]$, as follows. Since $Exp_i = [a_2, b_2]$, where $a_2 > 0$ and $b_2 > 0$, the branch 
condition $Exp_i > Exp_Q - x$ is always true. The condition $Exp_i = Exp_Q - x$ is valid when: (i) $Exp_i = Exp_Q$, or (ii) 
$Exp_i < Exp_Q$. The last condition, $Exp_i < Exp_Q - x$, is valid only if $Exp_i < Exp_Q$. 
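
A minimal Python sketch of this backward branching rule follows. It assumes intervals as (low, high) tuples with positive bounds and takes Exp_Q - x as the interval from 0 to the upper bound of Exp_Q, as described above; the function name is hypothetical.

```python
def backward_branches(exp_i, exp_q):
    """Valid relations between Exp_i and Exp_Q - x in the backward search.

    exp_i, exp_q : (low, high) intervals with positive bounds.
    The unknown timer value x is arbitrary, so Exp_Q - x is taken as
    [0, b1], where b1 is the upper bound of exp_q.
    """
    a2, b2 = exp_i
    lo, hi = 0, exp_q[1]
    valid = set()
    if b2 > lo:                  # always holds for positive timer values
        valid.add(">")
    if a2 <= hi and lo <= b2:    # Exp_i and [0, b1] share at least one value
        valid.add("=")
    if a2 < hi:                  # some value of Exp_i lies below b1
        valid.add("<")
    return valid


print(sorted(backward_branches((3, 5), (4, 6))))   # ['<', '=', '>']
print(sorted(backward_branches((7, 9), (4, 6))))   # ['>']
```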

These rules are integrated into the search algorithm for our method to deal with multiple stimuli and timers 
simultaneously. 

4. EXAMPLE CASE STUDIES 

In this section, we present several case studies that show how to apply the previous analysis results to examples 
in reliable multicast and related protocol design problems. 

4.1. TOPOLOGY SYNTHESIS 

In this subsection we apply the test synthesis method to the task where the timer values are known and the topology 
(i.e., D matrix) is to be synthesized according to the worst-case behavior. We explore various timer settings. We 
use the virtual LAN in Figure A.5 to look at two examples of topology synthesis: one uses timers with fixed 
randomization intervals and the other uses timers that are a function of distance. 

Let $Q$ be the requester and 1, 2 and 3 be potential responders. At time $t_0$, $Q$ sends the request. 
Figure A.5 The virtual LAN with 3 potential responders 



For simplicity we assume, without loss of generality, that the systems are ordered such that $V_{t_i} < V_{t_j}$ for $i < j$ 
(e.g., system 1 has the least $d_{Q,i} + Exp_i$, then 2, and then 3). Thus the inequalities $V_{t_i} < V_{t_j} + d_{j,i}$ are readily 
satisfied for $i < j$ and we need only satisfy them for $i > j$. 

From equation (1) for the worst-case (see Section 5) we get: 

$V_{t_2} < V_{t_1} + d_{1,2}$, 
$V_{t_3} < V_{t_1} + d_{1,3}$, 
$V_{t_3} < V_{t_2} + d_{2,3}$. (A.1) 

By satisfying these inequalities we obtain the delay settings of the worst case topology, as will be shown in the 
rest of this section. 

Timers with fixed randomization intervals. Some multicast applications and protocols (such as 
wb [4], IGMP [3] or PIM [38]) employ fixed randomization intervals to set the suppression timers. For instance, for 
the shared white board (wb) [4], the response timer is assigned a random value from the (uniformly distributed) 
interval $[t, 2t]$ where $t = 100$ msec for the source (src), and 200 msec for other responders. 

Assume $Q$ is a receiver with a lost packet. Using wb parameters we get $Exp_{src} = [100, 200]$ msec, and $Exp_i = 
[200, 400]$ msec for all other nodes. 

To derive worst-case topologies from inequalities (A.1) we may use a standard mathematical tool for linear or 
non-linear programming (for more details see Appendix 2). However, in the following we illustrate general techniques 
that may be used to obtain the solution. 

From inequalities (A.1) we get: 

$d_{Q,2} + Exp_2 = V_{t_2} < V_{t_1} + d_{1,2} = d_{Q,1} + Exp_1 + d_{1,2}$. 

This can be rewritten as 

$d_{Q,2} - (d_{Q,1} + d_{1,2}) < Exp_1 - Exp_2 = diff_{1,2}$, (A.2) 

where 

$$diff_{1,2} = \begin{cases} [100,200] - [200,400] = [-300,0] & \text{if 1 is src,} \\ [200,400] - [100,200] = [0,300] & \text{if 2 is src,} \\ [200,400] - [200,400] = [-200,200] & \text{otherwise.} \end{cases}$$ 

Similarly, we derive the following from inequalities for $V_{t_3}$: 

$d_{Q,3} - (d_{Q,1} + d_{1,3}) < diff_{1,3}$, and 
$d_{Q,3} - (d_{Q,2} + d_{2,3}) < diff_{2,3}$. 

If we assume system 1 to be the source, and for a conservative solution we choose the minimum value of diff, we 
get: 

$\min(diff_{1,2}) = \min(diff_{1,3}) = -300$, 
$\min(diff_{2,3}) = -200$. 

Figure A.6 The virtual LAN with delay assignment and labels 

We then substitute these values in the above inequalities, and assign the values of some of the delays to compute 
the others. 

Example: if we assign $d_{Q,1} = d_{Q,2} = d_{Q,3} = 100$ msec, we get: $d_{1,2} > 300$, $d_{1,3} > 300$ and $d_{2,3} > 200$. 
Figure A.6 shows one possible topology to which the above assigned delays can be applied. These delays exhibit 
worst-case behavior for the timer suppression mechanism. 
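
The arithmetic of this example can be reproduced with a short Python sketch. It assumes that system 1 is the source, uses the wb timer intervals quoted above, and takes the conservative (minimum) value of each interval difference; the helper names `interval_sub` and `min_delay_bound` are introduced here for illustration.

```python
def interval_sub(x, y):
    """Interval difference: [x_lo - y_hi, x_hi - y_lo]."""
    return (x[0] - y[1], x[1] - y[0])


def min_delay_bound(d_qi, d_qj, exp_i, exp_j):
    """Strict lower bound on d_{j,i} for an earlier system j and a later system i.

    Derived from d_{Q,i} - (d_{Q,j} + d_{j,i}) < min(Exp_j - Exp_i).
    """
    diff_min = interval_sub(exp_j, exp_i)[0]   # conservative choice of diff
    return d_qi - d_qj - diff_min


exp = {1: (100, 200), 2: (200, 400), 3: (200, 400)}   # system 1 is the source
d_q = {1: 100, 2: 100, 3: 100}                        # delays from Q, in msec

for j, i in [(1, 2), (1, 3), (2, 3)]:
    bound = min_delay_bound(d_q[i], d_q[j], exp[i], exp[j])
    print(f"d_{j},{i} > {bound} msec")
```

The loop prints the bounds 300, 300 and 200 msec, matching the values obtained in the example.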

Timers as a function of distance. In contrast to fixed timers, this section uses timers that are a function of 
an estimated distance. The expiration timer may be set as a function of the distance to the requester. For example, 
system $i$ may set its timer to respond to a request from system $Q$ in the interval $[C_1 \cdot E_{i,Q}, (C_1 + C_2) \cdot E_{i,Q}]$, where 
$E_{i,Q}$ is the estimated distance/delay from $i$ to $Q$, which is calculated using message exchange (e.g., SRM session 
messages) and is equal to $(d_{i,Q} + d_{Q,i})/2$. (Note that this estimate assumes symmetry, which sometimes is not valid.) 

[4] suggests values for $C_1$ and $C_2$ as 1 or $\log_{10} G$, where $G$ is the number of members in the group. 

We take $C_1 = C_2 = 1$ to synthesize the worst-case topology. We get the expression 

$Exp_1 - Exp_2 = [(d_{1,Q} + d_{Q,1})/2, d_{1,Q} + d_{Q,1}] - [(d_{2,Q} + d_{Q,2})/2, d_{2,Q} + d_{Q,2}]$. 

Example: If we assume that $d_{1,Q} = d_{Q,1} = d_{2,Q} = d_{Q,2} = 100$ msec, we can rewrite the above relation as 
$Exp_1 - Exp_2 = [-100, 100]$ msec. 

Substituting in equation (A.2) above, we get $d_{1,2} > 100$ msec. Under similar assumptions, we can obtain $d_{2,3} > 
100$ msec, and $d_{1,3} > 100$ msec. 

Topologies with the above delay settings will experience the worst case overhead behavior (as defined above) for 
the timer suppression mechanism. 
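
The distance-based example can be checked in the same way. The sketch below assumes symmetric delays of 100 msec to and from the requester and C1 = C2 = 1, as in the example; the function names are hypothetical and the code is illustrative only.

```python
def distance_timer(d_iq, d_qi, c1=1, c2=1):
    """Timer interval [C1 * E, (C1 + C2) * E] with E = (d_iq + d_qi) / 2."""
    e = (d_iq + d_qi) / 2
    return (c1 * e, (c1 + c2) * e)


def interval_sub(x, y):
    """Interval difference: [x_lo - y_hi, x_hi - y_lo]."""
    return (x[0] - y[1], x[1] - y[0])


exp1 = distance_timer(100, 100)      # (100.0, 200.0)
exp2 = distance_timer(100, 100)      # (100.0, 200.0)
diff = interval_sub(exp1, exp2)      # (-100.0, 100.0)

# From d_{Q,2} - (d_{Q,1} + d_{1,2}) < min(diff), with d_{Q,1} = d_{Q,2} = 100:
d_q1, d_q2 = 100, 100
bound = d_q2 - d_q1 - diff[0]
print(f"d_1,2 > {bound} msec")       # d_1,2 > 100.0 msec
```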

As was shown, the inequalities formulated automatically by our method in section 5, can be used with various 
timer strategies (e.g., fixed timers or timers as function of distance). Although the topologies we have presented are 
limited, a mathematical tool (such as LINDO) can be used to obtain solutions for larger topologies. 

4.2. TIMER CONFIGURATION 

In this subsection we give simple examples of the timer configuration task solution, where the delay bounds (i.e., 
D matrix) are given and the timer values are adjusted to achieve the required behavior. 

In these examples the delay is given as an interval [x,y] msec. We show an example for worst-case analysis. 

Worst-case analysis. If the given ranges for the delays are [2,200] msec for all delays, then the term 

$d_{Q,j} - d_{Q,i} + d_{j,i}$ evaluates to [-196,398]. From equation (A.2) above, we get 

$Exp_i < Exp_j - 196$, to guarantee that a response is triggered. 

If the delays are [5,50] msec, we get: 

$Exp_i < Exp_j - 45$, 

i.e., $i$'s expiration timer must be less than $j$'s by at least 45 msecs. Note that we have an implied inequality that 
$Exp_i > 0$ for all $i$. 

These timer expiration settings would exhibit worst-case behavior for the given delay bounds. 
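
The interval evaluation used in the first worst-case example can be sketched in Python as follows. The sketch assumes that all three delays range independently over the same interval; the helper names are introduced here for illustration.

```python
def interval_add(x, y):
    """Interval sum: [x_lo + y_lo, x_hi + y_hi]."""
    return (x[0] + y[0], x[1] + y[1])


def interval_sub(x, y):
    """Interval difference: [x_lo - y_hi, x_hi - y_lo]."""
    return (x[0] - y[1], x[1] - y[0])


def timer_offset(d_range):
    """Interval of d_{Q,j} - d_{Q,i} + d_{j,i} when every delay lies in d_range."""
    return interval_add(interval_sub(d_range, d_range), d_range)


term = timer_offset((2, 200))
print(term)                                    # (-196, 398)
print(f"Exp_i < Exp_j + ({term[0]})")          # Exp_i < Exp_j + (-196)
```

Taking the lower end of the interval gives the conservative timer setting: the inequality then holds for every delay assignment within the given bounds.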



